Abstract

We propose modified frequentist definitions for the determination of confidence
intervals for the case of Poisson statistics. We require that
$1 - \beta' \ge \sum_{n=0}^{n_{obs}+k} P(n|\lambda) \ge \alpha'$. We show that this definition
is equivalent to the Bayesian method with the prior $\pi(\lambda) \sim \lambda^{k}$. Other
generalizations are also considered. In particular, we propose a modified symmetric
frequentist definition which corresponds to the Bayes approach with the prior function
$\pi(\lambda) \sim \frac{1}{2}\left(1 + \frac{n_{obs}}{\lambda}\right)$. Modified frequentist
definitions for the case of nonzero background are proposed.
1 Introduction

In high energy physics one of the standard problems [1] is the determination of
confidence intervals for the parameter $\lambda$ of the Poisson distribution

$$P(n|\lambda) = \frac{\lambda^{n}}{n!} \exp(-\lambda). \tag{1}$$

There are two methods to solve this problem: the frequentist and the Bayesian.
In the Bayesian method [1, 2], by Bayes' theorem, the probability density for the
parameter $\lambda$ is determined as

$$p(\lambda|n_{obs}) = \frac{P(n_{obs}|\lambda)\,\pi(\lambda)}{\int_{0}^{\infty} P(n_{obs}|\lambda')\,\pi(\lambda')\,d\lambda'}. \tag{2}$$

Here $\pi(\lambda)$ is the prior function. In general it is not known, which is the main problem
of the Bayesian method. Formula (2) reduces the statistics problem to a probability
problem. At the $(1 - \alpha)$ probability level (footnote 1) the parameters $\lambda_{up}$ and
$\lambda_{down}$ are determined from the equation

$$\int_{\lambda_{down}}^{\lambda_{up}} p(\lambda|n_{obs})\,d\lambda = 1 - \alpha, \tag{3}$$

and the unknown parameter $\lambda$ lies between $\lambda_{down}$ and $\lambda_{up}$ with the probability
$1 - \alpha$. The solution of the equation (3) is not unique. One can define

$$\int_{\lambda_{up}}^{\infty} p(\lambda|n_{obs})\,d\lambda = \alpha', \tag{4}$$

$$\int_{0}^{\lambda_{down}} p(\lambda|n_{obs})\,d\lambda = \beta'. \tag{5}$$

In general the parameters $\alpha'$ and $\beta'$ are arbitrary except for the evident equality

$$\alpha' + \beta' = \alpha. \tag{6}$$

The most popular are the following options [1]:

1. $\lambda_{down} = 0$: upper limit.

2. $\lambda_{up} = \infty$: lower limit.

3. $\int_{0}^{\lambda_{down}} p(\lambda|n_{obs})\,d\lambda = \int_{\lambda_{up}}^{\infty} p(\lambda|n_{obs})\,d\lambda = \frac{\alpha}{2}$: symmetric interval.

4. The shortest interval: $p(\lambda|n_{obs})$ inside the interval is greater than or equal to
   $p(\lambda|n_{obs})$ outside the interval.

Footnote 1: Usually $\alpha$ is taken equal to 0.05.
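As an illustration of options 1 and 3, the following minimal sketch evaluates the posterior (2) for a flat prior, $\pi(\lambda) = 1$, and finds the interval endpoints numerically. The function names, the Simpson integration, the search bound `hi=100.0` and the bisection tolerance are assumptions of this sketch and are not taken from the paper.

```python
import math

def posterior(lam, n_obs):
    """Flat-prior posterior density p(lambda | n_obs) = lambda^n e^(-lambda) / n!."""
    return lam ** n_obs * math.exp(-lam) / math.factorial(n_obs)

def posterior_mass(lo, hi, n_obs, steps=2000):
    """Composite Simpson integral of the posterior over [lo, hi] (steps is even)."""
    h = (hi - lo) / steps
    total = posterior(lo, n_obs) + posterior(hi, n_obs)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * posterior(lo + i * h, n_obs)
    return total * h / 3.0

def quantile(q, n_obs, hi=100.0, tol=1e-6):
    """Bisection for lambda such that the posterior mass on [0, lambda] equals q."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_mass(0.0, mid, n_obs) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_obs, alpha = 3, 0.05
upper = quantile(1 - alpha, n_obs)              # option 1: lambda_down = 0
symmetric = (quantile(alpha / 2, n_obs),        # option 3: alpha/2 in each tail
             quantile(1 - alpha / 2, n_obs))
print(round(upper, 2), [round(v, 2) for v in symmetric])
```

The bound `hi=100.0` is adequate only for small $n_{obs}$; for larger counts it would have to be raised.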

In the frequentist approach the Neyman belt construction [3] (see Fig. 1 [1]) is used for
the determination of the confidence intervals.

Figure 1: Neyman belt construction.

For a continuous observable $-\infty < x < \infty$ with the probability density $f(x, \lambda)$
(footnote 2) we require that

$$\int_{x_{down}}^{x_{up}} f(x, \lambda)\,dx = 1 - \alpha, \tag{7}$$

or

$$\int_{x_{up}}^{\infty} f(x, \lambda)\,dx = \beta', \tag{8}$$

$$\int_{-\infty}^{x_{down}} f(x, \lambda)\,dx = \alpha', \tag{9}$$

$$\alpha' + \beta' = \alpha. \tag{10}$$

The equations (footnote 3)

$$\int_{x_{obs}}^{\infty} f(x, \lambda_{down})\,dx = \beta', \tag{11}$$

$$\int_{-\infty}^{x_{obs}} f(x, \lambda_{up})\,dx = \alpha' \tag{12}$$

determine the interval of possible values $\lambda_{down} \le \lambda \le \lambda_{up}$ of the parameter
$\lambda$ at the $(1 - \alpha)$ confidence level.

Footnote 2: Here $\lambda$ is some unknown parameter and $\int_{-\infty}^{\infty} f(x, \lambda)\,dx = 1$.

For the Poisson distribution $P(n|\lambda)$ the analog of the equation (7) has the form

$$\sum_{n=n_{down}(\lambda)}^{n_{up}(\lambda)} P(n|\lambda) \ge 1 - \alpha. \tag{13}$$

The equations for the determination of $\lambda_{down}$ and $\lambda_{up}$ (analogs of the equations
(11, 12)) have the form [51 El CZ]

$$\sum_{n=n_{obs}}^{\infty} P(n|\lambda_{down}) = \beta', \tag{14}$$

$$\sum_{n=0}^{n_{obs}} P(n|\lambda_{up}) = \alpha'. \tag{15}$$

As a consequence of the equations (14, 15) we find that for $\lambda_{up} = \lambda_{down}$ the probability
$1 - \alpha' - \beta' = -P(n_{obs}|\lambda_{up}) < 0$, which contradicts the intuition that the probability
$P(\lambda_{down} \le \lambda \le \lambda_{up}) = 1 - \alpha' - \beta' \to 0$ for $\lambda_{down} \to \lambda_{up}$,
i.e. $\alpha' + \beta' \to 1$. For a continuous random variable $x$ with a smooth probability
density $f(x, \lambda)$, as a consequence of the equations (11, 12), the evident limit
$\alpha' + \beta' \to 1$ does take place for $\lambda_{down} \to \lambda_{up}$.
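The sign problem described above can be checked numerically. The sketch below evaluates the two sums of equations (14) and (15) at one common value of $\lambda$ and compares $1 - \alpha' - \beta'$ with $-P(n_{obs}|\lambda)$. The helper names and the sample values $n_{obs} = 3$, $\lambda = 2.5$ are chosen here for illustration only.

```python
import math

def poisson(n, lam):
    """Poisson probability P(n | lambda), Eq. (1)."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def tail_sums(n_obs, lam):
    """Return (alpha', beta') from Eqs. (15) and (14) evaluated at the same lambda."""
    alpha_p = sum(poisson(n, lam) for n in range(n_obs + 1))    # sum over n = 0 .. n_obs
    beta_p = 1.0 - sum(poisson(n, lam) for n in range(n_obs))   # sum over n = n_obs .. infinity
    return alpha_p, beta_p

n_obs, lam = 3, 2.5
alpha_p, beta_p = tail_sums(n_obs, lam)
# Both sums contain the term P(n_obs | lambda), so the two printed values agree.
print(1 - alpha_p - beta_p, -poisson(n_obs, lam))
```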

In this paper (footnote 4) we propose modified frequentist definitions of the confidence
interval for the case of the Poisson distribution. We show that the modified frequentist
definitions are equivalent to the Bayesian approach. The paper is organized as follows. In
Section 2 we propose modified frequentist definitions of the confidence interval and show
their equivalence to the Bayes method. In Section 3 we discuss the case of nonzero background.
Section 4 contains concluding remarks.

Footnote 3: Here $x_{obs}$ is the observed value of the random variable $x$.
Footnote 4: The main results of this paper are contained in ref. [5].



2 Modified frequentist definitions of the confidence 
interval 

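For the case of a continuous random variable $x$ the equations (11, 12) are equivalent to the
equations

$$\int_{x_{obs}}^{\infty} f(x, \lambda_{down})\,dx = \beta', \tag{16}$$

$$\int_{x_{obs}}^{\infty} f(x, \lambda_{up})\,dx = 1 - \alpha' \tag{17}$$

or to the equations

$$\int_{-\infty}^{x_{obs}} f(x, \lambda_{up})\,dx = \alpha', \tag{18}$$

$$\int_{-\infty}^{x_{obs}} f(x, \lambda_{down})\,dx = 1 - \beta'. \tag{19}$$

One can find that the inequalities

$$1 - \beta' \ge \int_{-\infty}^{x_{obs}} f(x, \lambda)\,dx \ge \alpha' \tag{20}$$

and

$$1 - \alpha' \ge \int_{x_{obs}}^{\infty} f(x, \lambda)\,dx \ge \beta' \tag{21}$$

are equivalent and they determine the interval of possible values
$\lambda_{down} \le \lambda \le \lambda_{up}$ (see eqs. (11, 12)) at the $(1 - \alpha)$ confidence level.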
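The equivalence of (20) and (21) follows in one step from the normalization of $f(x, \lambda)$. A short worked version is given below; the shorthand $F(\lambda)$ is introduced here only for this derivation.

```latex
% F(\lambda) is shorthand (introduced here) for the left-tail integral in (20).
F(\lambda) \equiv \int_{-\infty}^{x_{obs}} f(x,\lambda)\,dx ,
\qquad
\int_{x_{obs}}^{\infty} f(x,\lambda)\,dx = 1 - F(\lambda) .

% Inequality (21) written in terms of F, then rearranged:
1 - \alpha' \ge 1 - F(\lambda) \ge \beta'
\;\Longleftrightarrow\;
\alpha' \le F(\lambda) \le 1 - \beta' ,
% which is inequality (20).
```

For a discrete variable the two tails share the term $P(n_{obs}|\lambda)$, so this step no longer goes through.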

For the Poisson distribution $P(n|\lambda)$, in close analogy with the inequalities (20, 21), we
require that (footnote 5)

$$1 - \beta' \ge P_{-}(n_{obs}|\lambda) \ge \alpha' \tag{22}$$

or

$$1 - \alpha' \ge P_{+}(n_{obs}|\lambda) \ge \beta', \tag{23}$$

where

$$P_{-}(n_{obs}|\lambda) = \sum_{n=0}^{n_{obs}} P(n|\lambda), \tag{24}$$

$$P_{+}(n_{obs}|\lambda) = \sum_{n=n_{obs}}^{\infty} P(n|\lambda). \tag{25}$$

Footnote 5: We can consider the inequalities (22, 23) as modified frequentist definitions for
the determination of confidence intervals.

For the Poisson distribution the inequalities (22) and (23) lead to the equations

$$P_{-}(n_{obs}|\lambda_{down}) = 1 - \beta', \tag{26}$$

$$P_{-}(n_{obs}|\lambda_{up}) = \alpha' \tag{27}$$

and

$$P_{+}(n_{obs}|\lambda_{down}) = \beta', \tag{28}$$

$$P_{+}(n_{obs}|\lambda_{up}) = 1 - \alpha' \tag{29}$$

for the determination of $\lambda_{down}$ and $\lambda_{up}$. As we mentioned before, the choice of
$\lambda_{down}$ and $\lambda_{up}$ is not unique. Probably the most natural choice is the use of the
ordering principle. According to this principle we require that the probability density
$P(n_{obs}|\lambda)$ inside the confidence interval $[\lambda_{down}, \lambda_{up}]$ is greater than or
equal to the probability density outside this interval. For the Poisson distribution this
requirement leads to the formula

$$P(n_{obs}|\lambda_{down}) = P(n_{obs}|\lambda_{up}) \tag{30}$$

for the determination of $\lambda_{up}$ and $\lambda_{down}$. For such an ordering principle $\alpha'$ and
$\beta'$ are not independent quantities. It is natural to use $\alpha = \alpha' + \beta'$ as a single
free parameter.

Unlike the case of a continuous variable, the equations (14, 15), (26, 27) and (28, 29)
are not equivalent for the discrete variable $n$: they differ in the presence or absence
of $P(n_{obs}|\lambda_{up,down})$ in some equations. For instance, for $\beta' = 0$, $\alpha' = \alpha$
(upper limit case) the equations (15) and (27) coincide and read as

$$\sum_{n=0}^{n_{obs}} P(n|\lambda_{up}) = \alpha, \tag{31}$$

while the equation (29) is equivalent to

$$\sum_{n=0}^{n_{obs}-1} P(n|\lambda_{up}) = \alpha. \tag{32}$$

For $n_{obs} = 3$ and $\alpha = 0.05$ we find that

$$\lambda \le 7.75 \quad \text{(Eq. 31)}, \tag{33}$$

$$\lambda \le 6.30 \quad \text{(Eq. 32)}. \tag{34}$$
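The two limits quoted in (33) and (34) can be reproduced with a few lines of code. The following is a minimal sketch that solves equations (31) and (32) by bisection; the function names, the search bound `hi=200.0` and the tolerance are assumptions of this sketch.

```python
import math

def cumulative(n_max, lam):
    """sum_{n=0}^{n_max} P(n | lam) for the Poisson distribution (0 if n_max < 0)."""
    return sum(lam ** n * math.exp(-lam) / math.factorial(n) for n in range(n_max + 1))

def solve_upper(n_max, alpha, hi=200.0, tol=1e-10):
    """Bisection for lam with cumulative(n_max, lam) = alpha (the sum decreases in lam)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative(n_max, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_obs, alpha = 3, 0.05
lam_eq31 = solve_upper(n_obs, alpha)        # Eq. (31): sum runs up to n_obs
lam_eq32 = solve_upper(n_obs - 1, alpha)    # Eq. (32): sum runs up to n_obs - 1
print(f"{lam_eq31:.2f} {lam_eq32:.2f}")     # prints 7.75 6.30
```

The only difference between the two calls is whether the term $P(n_{obs}|\lambda_{up})$ is included in the sum.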

Due to the identity [7]

$$P_{-}(n_{obs}|\lambda) = \int_{\lambda}^{\infty} P(n_{obs}|\lambda')\,d\lambda' \tag{35}$$

the confidence interval $[\lambda_{down}, \lambda_{up}]$ for the modified frequentist definition (22)
is determined from the equations

$$\alpha' = \int_{\lambda_{up}}^{\infty} P(n_{obs}|\lambda')\,d\lambda', \tag{36}$$

$$\beta' = \int_{0}^{\lambda_{down}} P(n_{obs}|\lambda')\,d\lambda'. \tag{37}$$

The parameter $\lambda$ lies in the interval

$$\lambda_{down} \le \lambda \le \lambda_{up} \tag{38}$$

with the probability $(1 - \alpha' - \beta')$. So we see that our modified frequentist definition (22)
is equivalent to the Bayes definitions (3, 4, 5) with the flat prior $\pi(\lambda) = 1$, namely:

$$\int_{\lambda_{down}}^{\lambda_{up}} P(n_{obs}|\lambda)\,d\lambda = 1 - \alpha' - \beta'. \tag{39}$$

One can show that our modified frequentist definition (23) (eqs. (28, 29)) is equivalent to
the Bayes approach with the prior function $\pi(\lambda) \sim \frac{1}{\lambda}$.
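The two "one can show" steps of this passage can be written out briefly. The derivation below is a sketch that uses only the definitions (1), (24) and (25); the second part assumes $n_{obs} \ge 1$.

```latex
% Differentiate the partial sum (24) term by term; the sum telescopes
% (the term with (n-1)! is absent for n = 0).
\frac{d}{d\lambda} P_{-}(n_{obs}|\lambda)
  = \sum_{n=0}^{n_{obs}} \left[ \frac{\lambda^{n-1}}{(n-1)!} - \frac{\lambda^{n}}{n!} \right] e^{-\lambda}
  = -\frac{\lambda^{n_{obs}}}{n_{obs}!}\, e^{-\lambda}
  = -P(n_{obs}|\lambda) .

% Since P_-(n_obs|\lambda) -> 0 for \lambda -> \infty, integration gives the identity (35):
P_{-}(n_{obs}|\lambda) = \int_{\lambda}^{\infty} P(n_{obs}|\lambda')\,d\lambda' .

% For definition (23), with n_obs >= 1:
P_{+}(n_{obs}|\lambda) = 1 - P_{-}(n_{obs}-1|\lambda)
  = \int_{0}^{\lambda} P(n_{obs}-1|\lambda')\,d\lambda' ,
\qquad
P(n_{obs}-1|\lambda') = \frac{n_{obs}}{\lambda'}\, P(n_{obs}|\lambda') ,
% i.e. the normalized posterior for the prior \pi(\lambda) \sim 1/\lambda.
```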

The coverage of the definition (22) means the following. For a hypothetical ensemble
of similar experiments the probability to observe a number of events $n \le n_{obs}$ satisfies
the inequalities (22).

Note that the equations for the determination of the upper limit $\lambda_{up}$ in the frequentist
and modified frequentist approach (22) coincide, whereas the equations for the determination
of the lower limit are different. Namely, the equation (26) is equivalent to the equation

$$\sum_{n=n_{obs}+1}^{\infty} P(n|\lambda_{down}) = \beta'. \tag{40}$$

The classical frequentist equation (15) for the determination of $\lambda_{up}$ is equivalent to the
Bayes equation (4) with the flat prior, while the equation (14) for the determination of
$\lambda_{down}$ is equivalent to the Bayes equation (5) with the prior $\pi(\lambda) \sim \frac{1}{\lambda}$.



It is possible to generalize our modified frequentist definition (22), namely:

$$1 - \beta' \ge P_{-}(n_{obs}|\lambda; k) \ge \alpha', \tag{41}$$

where

$$P_{-}(n_{obs}|\lambda; k) = \sum_{n=0}^{n_{obs}+k} P(n|\lambda) \tag{42}$$

and $k = 0, \pm 1, \pm 2, \ldots$

One can find that definition (41) leads to the Bayes equations (4, 5) with the prior function
$\pi(\lambda) \sim \lambda^{k}$. The cases $k = 0$ and $k = -1$ are equivalent to the inequalities (22) and (23).
Upper limits for three values of $k = 0, \pm 1$ are shown in Table 1 ($\alpha = 0.1$), in Table 2
($\alpha = 0.05$) and, correspondingly, in Fig. 2 and Fig. 3.
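The stated equivalence between definition (41) and the prior $\pi(\lambda) \sim \lambda^{k}$ can be checked numerically. The sketch below compares the frequentist sum (42) with the posterior tail integral for that prior. It assumes $n_{obs} + k \ge 0$; the Simpson integration, the truncation of the infinite upper limit at `lam + width`, and all function names are choices made for this illustration.

```python
import math

def freq_sum(n_obs, k, lam):
    """P_-(n_obs | lam; k) = sum_{n=0}^{n_obs+k} P(n | lam), Eq. (42)."""
    return sum(lam ** n * math.exp(-lam) / math.factorial(n)
               for n in range(n_obs + k + 1))

def bayes_tail(n_obs, k, lam, width=80.0, steps=8000):
    """Posterior mass above lam for the prior lam^k, by composite Simpson integration.

    The normalized posterior is t^(n_obs+k) e^(-t) / (n_obs+k)!; the infinite
    upper limit is truncated at lam + width (an assumption of this sketch).
    """
    m = n_obs + k

    def density(t):
        return t ** m * math.exp(-t) / math.factorial(m)

    h = width / steps
    total = density(lam) + density(lam + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(lam + i * h)
    return total * h / 3.0

n_obs, lam = 3, 4.0
for k in (-1, 0, 1):
    # The two columns should agree up to the integration error.
    print(k, freq_sum(n_obs, k, lam), bayes_tail(n_obs, k, lam))
```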

Table 1: Upper limits ($\lambda_{up}$) for confidence level 90% ($\alpha = 0.1$).

n_obs   k=-1    k=0     k=+1
0       -       2.30    3.89
1       2.30    3.89    5.32
2       3.89    5.32    6.68
3       5.32    6.68    7.99
4       6.68    7.99    9.27
5       7.99    9.27    10.53
6       9.27    10.53   11.77
7       10.53   11.77   12.99
8       11.77   12.99   14.21
9       12.99   14.21   15.41
10      14.21   15.41   16.60

Table 2: Upper limits ($\lambda_{up}$) for confidence level 95% ($\alpha = 0.05$).

n_obs   k=-1    k=0     k=+1
0       -       3.00    4.74
1       3.00    4.74    6.30
2       4.74    6.30    7.75
3       6.30    7.75    9.15
4       7.75    9.15    10.51
5       9.15    10.51   11.84
6       10.51   11.84   13.15
7       11.84   13.15   14.43
8       13.15   14.43   15.71
9       14.43   15.71   16.96
10      15.71   16.96   18.21

Figure 3: Upper limits ($\lambda_{up}$) for confidence level 95% ($\alpha = 0.05$), $k = -1, 0, +1$
(upper limits plotted versus $n_{obs}$).

We can further generalize definitions (41, 42) by introducing

$$P_{-}(n_{obs}|\lambda; c_k) = \sum_{k} c_k^{2}\, P_{-}(n_{obs}|\lambda; k), \tag{43}$$

where $\sum_{k} c_k^{2} = 1$. Again we require that

$$1 - \beta' \ge P_{-}(n_{obs}|\lambda; c_k) \ge \alpha'. \tag{44}$$

One can find that our definition (43, 44) is equivalent to the Bayes approach with the prior
function

$$\pi(\lambda) \sim \sum_{k} c_k^{2}\, l_k\, \lambda^{k}, \tag{45}$$

where

$$l_k = \frac{n_{obs}!}{(n_{obs}+k)!}. \tag{46}$$
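The origin of the factor $l_k$ can be seen by applying the identity (35) to each term of the weighted sum. The short derivation below is a sketch that reads the weights in (43) as $c_k^{2}$ with $\sum_k c_k^{2} = 1$; that reading is an assumption made here.

```latex
% Identity (35) with n_obs replaced by n_obs + k:
P_{-}(n_{obs}|\lambda; k) = \int_{\lambda}^{\infty} P(n_{obs}+k|\lambda')\,d\lambda' ,
\qquad
P(n_{obs}+k|\lambda') = \frac{n_{obs}!}{(n_{obs}+k)!}\, \lambda'^{\,k}\, P(n_{obs}|\lambda') .

% Summing over k with the weights c_k^2:
\sum_{k} c_k^{2}\, P_{-}(n_{obs}|\lambda; k)
  = \int_{\lambda}^{\infty} P(n_{obs}|\lambda')
    \left[ \sum_{k} c_k^{2}\, \frac{n_{obs}!}{(n_{obs}+k)!}\, \lambda'^{\,k} \right] d\lambda' ,
% so the bracket plays the role of the prior, and the integral equals 1 at \lambda = 0
% because \sum_k c_k^2 = 1.
```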

Note that in the modified frequentist inequalities (22, 23) the term $P(n_{obs}|\lambda)$ contributes
to both (22) and (23), which leads to the nonequivalence of these inequalities. One of the possible
symmetric generalizations of the modified frequentist inequalities (22, 23) looks as follows:

$$1 - \beta' \ge P_{-}(n_{obs}|\lambda) - \frac{1}{2} P(n_{obs}|\lambda) \ge \alpha', \tag{47}$$

$$1 - \alpha' \ge P_{+}(n_{obs}|\lambda) - \frac{1}{2} P(n_{obs}|\lambda) \ge \beta'. \tag{48}$$

Table 3: Upper limits ($\lambda_{up}$) for confidence level 90% ($\alpha = 0.1$).

n_obs   pi(lambda) ~ (1/2)(1 + n_obs/lambda)   pi(lambda) ~ 1/sqrt(lambda)
0       2.30                                   1.35
1       3.27                                   3.12
2       4.72                                   4.61
3       6.10                                   6.00
4       7.57                                   7.34
5       8.71                                   8.63
6       9.97                                   9.90
7       11.21                                  11.15
8       12.44                                  12.38
9       13.65                                  13.60
10      14.85                                  14.80


The inequalities (47) and (48) are equivalent to each other and moreover they are
equivalent to the Bayes approach with the prior function

$$\pi(\lambda) \sim \frac{1}{2}\left(1 + \frac{n_{obs}}{\lambda}\right). \tag{49}$$

Upper limits for the prior (49) and for the Jeffreys prior [9] $\pi(\lambda) \sim \frac{1}{\sqrt{\lambda}}$
can be found in Table 3 ($\alpha = 0.1$) and in Table 4 ($\alpha = 0.05$).
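For the upper-limit case ($\beta' = 0$, $\alpha' = \alpha$) the symmetric definition (47) reduces to solving $P_{-}(n_{obs}|\lambda_{up}) - \frac{1}{2}P(n_{obs}|\lambda_{up}) = \alpha$. The sketch below does this by bisection. It is restricted to $n_{obs} \ge 1$, and the function names, search bound and tolerance are assumptions of this sketch. The Jeffreys-prior column is not covered.

```python
import math

def poisson(n, lam):
    """Poisson probability P(n | lambda), Eq. (1)."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def symmetric_stat(n_obs, lam):
    """P_-(n_obs | lam) - P(n_obs | lam) / 2, the middle term of inequality (47)."""
    return sum(poisson(n, lam) for n in range(n_obs + 1)) - 0.5 * poisson(n_obs, lam)

def symmetric_upper_limit(n_obs, alpha, hi=200.0, tol=1e-10):
    """Bisection for lam with symmetric_stat(n_obs, lam) = alpha (n_obs >= 1 assumed)."""
    if n_obs < 1:
        raise ValueError("this sketch assumes n_obs >= 1")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if symmetric_stat(n_obs, mid) > alpha:   # the statistic decreases with lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for alpha in (0.1, 0.05):
    print(alpha, round(symmetric_upper_limit(1, alpha), 2))
```

For $n_{obs} = 1$ the results can be compared with the first data column of the tables for the prior (49).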

Table 4: Upper limits ($\lambda_{up}$) for confidence level 95% ($\alpha = 0.05$).

n_obs   pi(lambda) ~ (1/2)(1 + n_obs/lambda)   pi(lambda) ~ 1/sqrt(lambda)
0       3.00                                   1.92
1       4.11                                   3.90
2       5.68                                   5.53
3       7.16                                   7.03
4       8.57                                   8.45
5       9.93                                   9.83
6       11.27                                  11.18
7       12.58                                  12.49
8       13.87                                  13.79
9       15.14                                  15.07
10      16.40                                  16.33

3 The case of nonzero background

For nonzero background the parameter $\lambda$ is represented in the form

$$\lambda = b + s. \tag{50}$$

Here $b \ge 0$ is the known background and $s$ is the unknown signal. In the Bayes approach the
generalization of the formula (2) reads

$$p(s|n_{obs}, b) = \frac{P(n_{obs}|b+s)\,\pi(b, s)}{\int_{0}^{\infty} P(n_{obs}|b+s')\,\pi(b, s')\,ds'}. \tag{51}$$

For the flat prior we find

$$p(s|n_{obs}, b) = \frac{P(n_{obs}|b+s)}{\int_{b}^{\infty} P(n_{obs}|\lambda')\,d\lambda'}. \tag{52}$$

The main effect of nonzero background is the appearance of the factor

$$K(n_{obs}, b) = \int_{b}^{\infty} P(n_{obs}|\lambda)\,d\lambda \tag{53}$$

in the denominator of the formula (52). For zero background $K(n_{obs}, b = 0) = 1$. One can
interpret the appearance of the additional factor $K(n_{obs}, b)$ in terms of conditional probability.
Indeed, for the flat prior $P(n_{obs}|\lambda)\,d\lambda$ is the probability that the parameter $\lambda$ lies in
the interval $[\lambda, \lambda + d\lambda]$. For the case of nonzero background $b$ the parameter
$\lambda = b + s \ge b$. The probability that $\lambda \ge b$ is equal to $p(\lambda \ge b|n_{obs}) = K(n_{obs}, b)$.
The conditional probability that $\lambda$ lies in the interval $[\lambda, \lambda + d\lambda]$ provided
$\lambda \ge b$ is determined by the standard formula

$$p(\lambda, n_{obs}|\lambda \ge b)\,d\lambda = \frac{P(n_{obs}|\lambda)\,d\lambda}{p(\lambda \ge b)} = \frac{P(n_{obs}|\lambda)\,d\lambda}{K(n_{obs}, b)} \tag{54}$$

and it coincides with the Bayes formula (52).
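The role of $K(n_{obs}, b)$ as a normalization factor can be illustrated numerically. The sketch below evaluates $K$ through the identity (35), i.e. as the finite sum $P_{-}(n_{obs}|b)$, and checks that the flat-prior posterior of the signal integrates to one. The function names, the Simpson rule and the truncation of the $s$ range at 80 are assumptions of this sketch.

```python
import math

def poisson(n, lam):
    """Poisson probability P(n | lambda), Eq. (1)."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

def k_factor(n_obs, b):
    """K(n_obs, b) of Eq. (53), evaluated as the finite sum P_-(n_obs | b) via identity (35)."""
    return sum(poisson(n, b) for n in range(n_obs + 1))

def signal_posterior(s, n_obs, b):
    """Flat-prior posterior density of the signal s, Eq. (52)."""
    return poisson(n_obs, b + s) / k_factor(n_obs, b)

def integrate(func, lo, hi, steps=4000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

n_obs, b = 3, 2.0
norm = integrate(lambda s: signal_posterior(s, n_obs, b), 0.0, 80.0)
print(k_factor(n_obs, b), norm)   # K is below 1 for b > 0; norm should be close to 1
```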

In the frequentist approach the naive generalization of the inequality (22) is

$$1 - \beta' \ge P_{-}(n_{obs}|s + b) \ge \alpha'. \tag{55}$$

One can show that

$$1 - \alpha' - \beta' = \int_{b + s_{down}}^{b + s_{up}} P(n_{obs}|\lambda)\,d\lambda \le \int_{b}^{\infty} P(n_{obs}|\lambda)\,d\lambda. \tag{56}$$

As a consequence of the inequality (56) the probability that the signal $s$ lies in the interval
$0 \le s \le \infty$ is equal to $\int_{b}^{\infty} P(n_{obs}|\lambda')\,d\lambda'$ and it is less than unity for
nonzero background $b > 0$, which contradicts the intuition that the full probability that the
signal $s$ lies between zero and infinity must be equal to unity. To cure this drawback let us
require that (footnote 6)

$$1 - \beta' \ge \frac{P_{-}(n_{obs}|s + b)}{P_{-}(n_{obs}|b)} \ge \alpha'. \tag{57}$$

The inequality (57) leads to the equations for the determination of $s_{down}$ and $s_{up}$ which
coincide with the corresponding Bayes equations. The generalization of the inequalities
(57) is straightforward, for instance the inequality (44) reads

$$1 - \beta' \ge \frac{P_{-}(n_{obs}|s + b; c_k)}{P_{-}(n_{obs}|b; c_k)} \ge \alpha'. \tag{58}$$

The upper limit on the signal $s$ derived from the inequality (58) coincides with the upper limit
in the $CL_s$ method DEI El.
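As an illustration of the inequality (57) in the upper-limit case ($\beta' = 0$, $\alpha' = \alpha$), the following sketch solves $P_{-}(n_{obs}|s_{up} + b) / P_{-}(n_{obs}|b) = \alpha$ for $s_{up}$ by bisection. The function names, search bound and tolerance are assumptions of this sketch, and the background values are arbitrary examples.

```python
import math

def cumulative(n_obs, lam):
    """P_-(n_obs | lam) = sum_{n=0}^{n_obs} P(n | lam), Eq. (24)."""
    return sum(lam ** n * math.exp(-lam) / math.factorial(n) for n in range(n_obs + 1))

def signal_upper_limit(n_obs, b, alpha, hi=200.0, tol=1e-10):
    """Bisection for s_up with P_-(n_obs | s_up + b) / P_-(n_obs | b) = alpha."""
    denom = cumulative(n_obs, b)
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative(n_obs, mid + b) / denom > alpha:   # the ratio decreases with s
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.05
for b in (0.0, 1.0, 3.0):
    print(b, round(signal_upper_limit(3, b, alpha), 2))
```

For $n_{obs} = 0$ the ratio in (57) equals $e^{-s}$, so in this sketch the limit is $-\ln\alpha$ for any background $b$.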



Footnote 6: The interpretation of the inequality (57) is as follows. We can consider
$P_{-}(n_{obs}|b)$ as the probability that $\lambda \ge b$. The ratio
$\frac{P_{-}(n_{obs}|s + b)}{P_{-}(n_{obs}|b)}$ is the conditional probability that $\lambda \ge b + s$
provided $\lambda \ge b$.

4 Conclusions

To conclude, let us stress our main result. For the Poisson distribution we have proposed
modified frequentist definitions of the confidence interval and have shown the equivalence
of the modified frequentist approach and the Bayes approach. It means in particular that
the frequentist approach is not unique.

This work has been supported by RFBR grant N 10-02-00468. 